Overview

Dataset statistics

Number of variables: 28
Number of observations: 114144
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1683
Duplicate rows (%): 1.5%
Total size in memory: 19.8 MiB
Average record size in memory: 182.0 B
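
The two derived figures in this table follow from the raw counts. The sketch below is only an arithmetic cross-check: the constants are copied from the table, and the variable names are chosen for illustration. Because the total size is itself rounded to 0.1 MiB, the record size comes out near, not exactly at, the reported 182.0 B.

```python
# Figures copied from the "Dataset statistics" table.
n_rows = 114_144
n_duplicates = 1_683
total_mib = 19.8  # already rounded in the report

# Share of rows flagged as duplicates, in percent.
duplicate_pct = 100 * n_duplicates / n_rows

# Average bytes per row, using 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_mib * 1024 * 1024 / n_rows

print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```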

Variable types

Numeric: 11
Categorical: 11
Boolean: 6

Warnings

Reason has constant value "RA" [Constant]
Is_year_start has constant value "False" [Constant]
Dataset has 1683 (1.5%) duplicate rows [Duplicates]
Id has a high cardinality: 19353 distinct values [High cardinality]
Applied is highly correlated with Received and 2 other fields [High correlation]
Received is highly correlated with Applied and 2 other fields [High correlation]
logapplied is highly correlated with Applied and 2 other fields [High correlation]
logreceived is highly correlated with Applied and 2 other fields [High correlation]
Year is highly correlated with Elapsed [High correlation]
Month is highly correlated with Week and 1 other field [High correlation]
Week is highly correlated with Month and 1 other field [High correlation]
Dayofyear is highly correlated with Month and 1 other field [High correlation]
Elapsed is highly correlated with Year [High correlation]
Year is highly correlated with Is_year_start and 1 other field [High correlation]
Payment_Type is highly correlated with Payment_Method and 2 other fields [High correlation]
Is_quarter_end is highly correlated with Is_year_start and 1 other field [High correlation]
Is_year_end is highly correlated with Is_year_start and 1 other field [High correlation]
Age is highly correlated with AgeGroup and 2 other fields [High correlation]
Area is highly correlated with Is_year_start and 1 other field [High correlation]
Gender is highly correlated with Is_year_start and 1 other field [High correlation]
Is_quarter_start is highly correlated with Is_year_start and 1 other field [High correlation]
Payment_Method is highly correlated with Payment_Type and 2 other fields [High correlation]
AgeGroup is highly correlated with Age and 2 other fields [High correlation]
Is_year_start is highly correlated with Year and 14 other fields [High correlation]
Location is highly correlated with Is_year_start and 1 other field [High correlation]
Is_month_end is highly correlated with Is_year_start and 1 other field [High correlation]
Reason is highly correlated with Year and 14 other fields [High correlation]
Is_month_start is highly correlated with Is_year_start and 1 other field [High correlation]
True_False is highly correlated with Is_year_start and 1 other field [High correlation]
Ratio is highly skewed (γ1 = 337.8511828) [Skewed]
Dayofweek has 21644 (19.0%) zeros [Zeros]
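
The Constant and Duplicates warnings describe simple properties of the table. The report does not show how pandas-profiling computes them, so the following is only a standard-library sketch of one plausible definition, run on three made-up rows. The helper names `constant_columns` and `count_duplicate_rows` and the sample data are hypothetical.

```python
from collections import Counter


def constant_columns(rows):
    """Return the names of columns that hold a single distinct value."""
    if not rows:
        return []
    return [col for col in rows[0] if len({row[col] for row in rows}) == 1]


def count_duplicate_rows(rows):
    """Count rows that exactly repeat an earlier row."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return sum(count - 1 for count in counts.values())


# Toy rows for illustration only; not taken from the profiled dataset.
sample = [
    {"Reason": "RA", "Is_year_start": False, "Gender": "F"},
    {"Reason": "RA", "Is_year_start": False, "Gender": "M"},
    {"Reason": "RA", "Is_year_start": False, "Gender": "F"},
]

constants = constant_columns(sample)
duplicates = count_duplicate_rows(sample)
print(constants, duplicates)
```

Columns flagged as constant carry no information for modelling, which is why the report also marks Reason as rejected.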

Reproduction

Analysis started: 2021-04-26 11:24:28.256103
Analysis finished: 2021-04-26 11:26:23.646696
Duration: 1 minute and 55.39 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

Applied
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1197
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 382.8623844
Minimum: 0.001
Maximum: 12000
Zeros: 0
Zeros (%): 0.0%
Memory size: 891.9 KiB

Quantile statistics

Minimum: 0.001
5th percentile: 106
Q1: 210
Median: 320
Q3: 500
95th percentile: 880
Maximum: 12000
Range: 11999.999
Interquartile range (IQR): 290

Descriptive statistics

Standard deviation: 257.818421
Coefficient of variation (CV): 0.6733971044
Kurtosis: 47.41735371
Mean: 382.8623844
Median Absolute Deviation (MAD): 130
Skewness: 2.890217658
Sum: 43701444
Variance: 66470.33822
Monotonicity: Not monotonic
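
Several of the statistics for Applied are simple functions of the others, which makes them easy to sanity-check. The sketch below copies the reported values into illustrative variable names (the names are not part of the report) and recomputes the mean, variance, coefficient of variation, range, and IQR using their usual definitions.

```python
import math

# Values copied from the Applied section of the report.
n = 114_144            # number of observations
total = 43_701_444     # Sum
std = 257.818421       # Standard deviation
q1, q3 = 210, 500
minimum, maximum = 0.001, 12000

mean = total / n                  # Mean = Sum / n
variance = std ** 2               # Variance = (standard deviation)^2
cv = std / mean                   # Coefficient of variation
iqr = q3 - q1                     # Interquartile range
value_range = maximum - minimum   # Range

print(mean, variance, cv, iqr, value_range)
```

Each recomputed value agrees with the reported one to the precision shown in the report.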
Histogram with fixed size bins (bins=50)
Value                  Count    Frequency (%)
210                    2941     2.6%
400                    2653     2.3%
300                    2628     2.3%
500                    2479     2.2%
350                    2341     2.1%
250                    2255     2.0%
200                    2166     1.9%
600                    2122     1.9%
226                    1987     1.7%
450                    1771     1.6%
Other values (1187)    90801    79.5%
Value                  Count    Frequency (%)
0.001                  1        < 0.1%
2                      4        < 0.1%
3                      2        < 0.1%
4                      1        < 0.1%
5                      5        < 0.1%
Value                  Count    Frequency (%)
12000                  1        < 0.1%
5981                   1        < 0.1%
4305                   1        < 0.1%
4215                   1        < 0.1%
4107                   1        < 0.1%

Gender
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 891.9 KiB

Length

Max length: 2
Median length: 1
Mean length: 1.000175217
Min length: 1

Characters and Unicode

Total characters: 114164
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
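
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point ("Lu" for Uppercase Letter, "Nd" for Decimal Number, and so on). The sketch below is not the profiler's own code; the helper name `category_totals` is hypothetical, and the counts are the Gender value counts from this report. Scripts and blocks are not available from `unicodedata`, so only categories are shown.

```python
import unicodedata
from collections import Counter


def category_totals(value_counts):
    """Total characters per Unicode general category, weighted by value count."""
    totals = Counter()
    for value, count in value_counts.items():
        for ch in value:
            totals[unicodedata.category(ch)] += count
    return totals


# Gender value counts as reported in this section.
gender_counts = {"F": 71417, "M": 42707, "GD": 20}
gender_totals = category_totals(gender_counts)
print(dict(gender_totals))

# Characters that appear in other columns of the report, such as Age.
print(unicodedata.category("-"), unicodedata.category("+"), unicodedata.category("5"))
```

All Gender characters fall into the single category "Lu", and the total matches the 114164 characters reported above.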

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: F
4th row: F
5th row: F
Value    Count    Frequency (%)
F        71417    62.6%
M        42707    37.4%
GD       20       < 0.1%
Histogram of lengths of the category
Value    Count    Frequency (%)
f        71417    62.6%
m        42707    37.4%
gd       20       < 0.1%

Most occurring characters

ValueCountFrequency (%)
F71417
62.6%
M42707
37.4%
G20
 
< 0.1%
D20
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter114164
100.0%

Most frequent character per category

ValueCountFrequency (%)
F71417
62.6%
M42707
37.4%
G20
 
< 0.1%
D20
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin114164
100.0%

Most frequent character per script

ValueCountFrequency (%)
F71417
62.6%
M42707
37.4%
G20
 
< 0.1%
D20
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII114164
100.0%

Most frequent character per block

ValueCountFrequency (%)
F71417
62.6%
M42707
37.4%
G20
 
< 0.1%
D20
 
< 0.1%

Payment_Method
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 891.9 KiB

Length

Max length: 2
Median length: 2
Mean length: 1.999964957
Min length: 1

Characters and Unicode

Total characters: 228284
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AV
2nd row: AV
3rd row: AV
4th row: AV
5th row: AV
Value    Count    Frequency (%)
AV       98803    86.6%
RP       15337    13.4%
U        4        < 0.1%
Histogram of lengths of the category
Value    Count    Frequency (%)
av       98803    86.6%
rp       15337    13.4%
u        4        < 0.1%

Most occurring characters

ValueCountFrequency (%)
A98803
43.3%
V98803
43.3%
R15337
 
6.7%
P15337
 
6.7%
U4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter228284
100.0%

Most frequent character per category

ValueCountFrequency (%)
A98803
43.3%
V98803
43.3%
R15337
 
6.7%
P15337
 
6.7%
U4
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin228284
100.0%

Most frequent character per script

ValueCountFrequency (%)
A98803
43.3%
V98803
43.3%
R15337
 
6.7%
P15337
 
6.7%
U4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII228284
100.0%

Most frequent character per block

ValueCountFrequency (%)
A98803
43.3%
V98803
43.3%
R15337
 
6.7%
P15337
 
6.7%
U4
 
< 0.1%

Location
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 891.9 KiB

Length

Max length: 2
Median length: 1
Mean length: 1.403788197
Min length: 1

Characters and Unicode

Total characters: 160234
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: U
3rd row: M
4th row: M
5th row: M
Value    Count    Frequency (%)
M        51873    45.4%
NE       34025    29.8%
O        12106    10.6%
PP       12065    10.6%
U        4075     3.6%
Histogram of lengths of the category
Value    Count    Frequency (%)
m        51873    45.4%
ne       34025    29.8%
o        12106    10.6%
pp       12065    10.6%
u        4075     3.6%

Most occurring characters

ValueCountFrequency (%)
M51873
32.4%
N34025
21.2%
E34025
21.2%
P24130
15.1%
O12106
 
7.6%
U4075
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter160234
100.0%

Most frequent character per category

ValueCountFrequency (%)
M51873
32.4%
N34025
21.2%
E34025
21.2%
P24130
15.1%
O12106
 
7.6%
U4075
 
2.5%

Most occurring scripts

ValueCountFrequency (%)
Latin160234
100.0%

Most frequent character per script

ValueCountFrequency (%)
M51873
32.4%
N34025
21.2%
E34025
21.2%
P24130
15.1%
O12106
 
7.6%
U4075
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII160234
100.0%

Most frequent character per block

ValueCountFrequency (%)
M51873
32.4%
N34025
21.2%
E34025
21.2%
P24130
15.1%
O12106
 
7.6%
U4075
 
2.5%

Received
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3125
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 382.8572231
Minimum: 0.21
Maximum: 12000
Zeros: 0
Zeros (%): 0.0%
Memory size: 891.9 KiB

Quantile statistics

Minimum: 0.21
5th percentile: 106
Q1: 210
Median: 320
Q3: 500
95th percentile: 880
Maximum: 12000
Range: 11999.79
Interquartile range (IQR): 290

Descriptive statistics

Standard deviation: 257.8189139
Coefficient of variation (CV): 0.6734074697
Kurtosis: 47.41708631
Mean: 382.8572231
Median Absolute Deviation (MAD): 130
Skewness: 2.890193348
Sum: 43700854.87
Variance: 66470.59234
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count    Frequency (%)
210                    2938     2.6%
400                    2650     2.3%
300                    2624     2.3%
500                    2464     2.2%
350                    2339     2.0%
250                    2250     2.0%
200                    2160     1.9%
600                    2121     1.9%
226                    1971     1.7%
450                    1769     1.5%
Other values (3115)    90858    79.6%
Value                  Count    Frequency (%)
0.21                   1        < 0.1%
2                      3        < 0.1%
2.1                    1        < 0.1%
2.5                    1        < 0.1%
3                      1        < 0.1%
Value                  Count    Frequency (%)
12000                  1        < 0.1%
5981                   1        < 0.1%
4305                   1        < 0.1%
4215                   1        < 0.1%
4107.44                1        < 0.1%

Id
Categorical

HIGH CARDINALITY

Distinct: 19353
Distinct (%): 17.0%
Missing: 0
Missing (%): 0.0%
Memory size: 891.9 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 1369728
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12685
Unique (%): 11.1%

Sample

1st row: GHI001040140
2nd row: GHI000096195
3rd row: GHI000873216
4th row: GHI000165164
5th row: GHI000085542
Value                   Count    Frequency (%)
GHI000112669            18223    16.0%
GHI000753413            1056     0.9%
GHI001206283            974      0.9%
GHI000143648            639      0.6%
GHI001853440            588      0.5%
GHI001354498            521      0.5%
GHI000100619            521      0.5%
GHI001086470            517      0.5%
GHI000086558            500      0.4%
GHI000437510            494      0.4%
Other values (19343)    90111    78.9%
Histogram of lengths of the category
Value                   Count    Frequency (%)
ghi000112669            18223    16.0%
ghi000753413            1056     0.9%
ghi001206283            974      0.9%
ghi000143648            639      0.6%
ghi001853440            588      0.5%
ghi000100619            521      0.5%
ghi001354498            521      0.5%
ghi001086470            517      0.5%
ghi000086558            500      0.4%
ghi000437510            494      0.4%
Other values (19343)    90111    78.9%

Most occurring characters

ValueCountFrequency (%)
0378588
27.6%
1133816
 
9.8%
G114144
 
8.3%
H114144
 
8.3%
I114144
 
8.3%
689866
 
6.6%
273495
 
5.4%
969765
 
5.1%
461201
 
4.5%
856666
 
4.1%
Other values (3)163899
12.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1027296
75.0%
Uppercase Letter342432
 
25.0%

Most frequent character per category

ValueCountFrequency (%)
0378588
36.9%
1133816
 
13.0%
689866
 
8.7%
273495
 
7.2%
969765
 
6.8%
461201
 
6.0%
856666
 
5.5%
355761
 
5.4%
555459
 
5.4%
752679
 
5.1%
ValueCountFrequency (%)
G114144
33.3%
H114144
33.3%
I114144
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common1027296
75.0%
Latin342432
 
25.0%

Most frequent character per script

ValueCountFrequency (%)
0378588
36.9%
1133816
 
13.0%
689866
 
8.7%
273495
 
7.2%
969765
 
6.8%
461201
 
6.0%
856666
 
5.5%
355761
 
5.4%
555459
 
5.4%
752679
 
5.1%
ValueCountFrequency (%)
G114144
33.3%
H114144
33.3%
I114144
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII1369728
100.0%

Most frequent character per block

ValueCountFrequency (%)
0378588
27.6%
1133816
 
9.8%
G114144
 
8.3%
H114144
 
8.3%
I114144
 
8.3%
689866
 
6.6%
273495
 
5.4%
969765
 
5.1%
461201
 
4.5%
856666
 
4.1%
Other values (3)163899
12.0%

Reason
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 891.9 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 228288
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RA
2nd row: RA
3rd row: RA
4th row: RA
5th row: RA
Value    Count     Frequency (%)
RA       114144    100.0%
Histogram of lengths of the category
Value    Count     Frequency (%)
ra       114144    100.0%

Most occurring characters

ValueCountFrequency (%)
R114144
50.0%
A114144
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter228288
100.0%

Most frequent character per category

ValueCountFrequency (%)
R114144
50.0%
A114144
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin228288
100.0%

Most frequent character per script

ValueCountFrequency (%)
R114144
50.0%
A114144
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII228288
100.0%

Most frequent character per block

ValueCountFrequency (%)
R114144
50.0%
A114144
50.0%

Age
Categorical

HIGH CORRELATION

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 891.9 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.885819666
Min length: 2

Characters and Unicode

Total characters: 557687
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 25-29
2nd row: 20-24
3rd row: 20-24
4th row: 40-44
5th row: 25-29
Value               Count    Frequency (%)
25-29               20220    17.7%
20-24               17265    15.1%
30-34               16476    14.4%
35-39               13015    11.4%
40-44               10260    9.0%
45-49               9401     8.2%
50-54               7379     6.5%
65+                 5915     5.2%
55-59               5698     5.0%
60-64               4184     3.7%
Other values (3)    4331     3.8%
Histogram of lengths of the category
Value               Count    Frequency (%)
25-29               20220    17.7%
20-24               17265    15.1%
30-34               16476    14.4%
35-39               13015    11.4%
40-44               10260    9.0%
45-49               9401     8.2%
50-54               7379     6.5%
65                  5915     5.2%
55-59               5698     5.0%
60-64               4184     3.7%
Other values (3)    4331     3.8%

Most occurring characters

ValueCountFrequency (%)
-107828
19.3%
494886
17.0%
580403
14.4%
274970
13.4%
358982
10.6%
055564
10.0%
952264
9.4%
614357
 
2.6%
18261
 
1.5%
+5915
 
1.1%
Other values (2)4257
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number443944
79.6%
Dash Punctuation107828
 
19.3%
Math Symbol5915
 
1.1%

Most frequent character per category

ValueCountFrequency (%)
494886
21.4%
580403
18.1%
274970
16.9%
358982
13.3%
055564
12.5%
952264
11.8%
614357
 
3.2%
18261
 
1.9%
83930
 
0.9%
7327
 
0.1%
ValueCountFrequency (%)
-107828
100.0%
ValueCountFrequency (%)
+5915
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common557687
100.0%

Most frequent character per script

ValueCountFrequency (%)
-107828
19.3%
494886
17.0%
580403
14.4%
274970
13.4%
358982
10.6%
055564
10.0%
952264
9.4%
614357
 
2.6%
18261
 
1.5%
+5915
 
1.1%
Other values (2)4257
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII557687
100.0%

Most frequent character per block

ValueCountFrequency (%)
-107828
19.3%
494886
17.0%
580403
14.4%
274970
13.4%
358982
10.6%
055564
10.0%
952264
9.4%
614357
 
2.6%
18261
 
1.5%
+5915
 
1.1%
Other values (2)4257
 
0.8%

Area
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 891.9 KiB

Length

Max length: 3
Median length: 1
Mean length: 1.459603659
Min length: 1

Characters and Unicode

Total characters: 166605
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: T
2nd row: O
3rd row: O
4th row: W
5th row: C
Value    Count    Frequency (%)
O        31704    27.8%
AM       29737    26.1%
C        12745    11.2%
W        7005     6.1%
BP       6927     6.1%
T        5909     5.2%
S        5601     4.9%
NL       4160     3.6%
EC       3899     3.4%
Wlg      3869     3.4%
Histogram of lengths of the category
Value    Count    Frequency (%)
o        31704    27.8%
am       29737    26.1%
c        12745    11.2%
w        7005     6.1%
bp       6927     6.1%
t        5909     5.2%
s        5601     4.9%
nl       4160     3.6%
ec       3899     3.4%
wlg      3869     3.4%

Most occurring characters

ValueCountFrequency (%)
O31704
19.0%
A29737
17.8%
M29737
17.8%
C16644
10.0%
W10874
 
6.5%
B6927
 
4.2%
P6927
 
4.2%
N6748
 
4.1%
T5909
 
3.5%
S5601
 
3.4%
Other values (4)15797
9.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter158867
95.4%
Lowercase Letter7738
 
4.6%

Most frequent character per category

ValueCountFrequency (%)
O31704
20.0%
A29737
18.7%
M29737
18.7%
C16644
10.5%
W10874
 
6.8%
B6927
 
4.4%
P6927
 
4.4%
N6748
 
4.2%
T5909
 
3.7%
S5601
 
3.5%
Other values (2)8059
 
5.1%
ValueCountFrequency (%)
l3869
50.0%
g3869
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin166605
100.0%

Most frequent character per script

ValueCountFrequency (%)
O31704
19.0%
A29737
17.8%
M29737
17.8%
C16644
10.0%
W10874
 
6.5%
B6927
 
4.2%
P6927
 
4.2%
N6748
 
4.1%
T5909
 
3.5%
S5601
 
3.4%
Other values (4)15797
9.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII166605
100.0%

Most frequent character per block

ValueCountFrequency (%)
O31704
19.0%
A29737
17.8%
M29737
17.8%
C16644
10.0%
W10874
 
6.5%
B6927
 
4.2%
P6927
 
4.2%
N6748
 
4.1%
T5909
 
3.5%
S5601
 
3.4%
Other values (4)15797
9.5%

True_False
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 891.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 114144
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0
Value    Count     Frequency (%)
0        113707    99.6%
1        437       0.4%
Histogram of lengths of the category
Value    Count     Frequency (%)
0        113707    99.6%
1        437       0.4%

Most occurring characters

ValueCountFrequency (%)
0113707
99.6%
1437
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number114144
100.0%

Most frequent character per category

ValueCountFrequency (%)
0113707
99.6%
1437
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Common114144
100.0%

Most frequent character per script

ValueCountFrequency (%)
0113707
99.6%
1437
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII114144
100.0%

Most frequent character per block

ValueCountFrequency (%)
0113707
99.6%
1437
 
0.4%

AgeGroup
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 891.9 KiB

Length

Max length: 7
Median length: 5
Mean length: 5.03155663
Min length: 3

Characters and Unicode

Total characters: 574322
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Adult
2nd row: Adult
3rd row: Adult
4th row: MidAge
5th row: Adult
Value      Count    Frequency (%)
MidAge     49152    43.1%
Adult      41415    36.3%
Old        23176    20.3%
Teenage    401      0.4%
Histogram of lengths of the category
Value      Count    Frequency (%)
midage     49152    43.1%
adult      41415    36.3%
old        23176    20.3%
teenage    401      0.4%

Most occurring characters

ValueCountFrequency (%)
d113743
19.8%
A90567
15.8%
l64591
11.2%
e50355
8.8%
g49553
8.6%
M49152
8.6%
i49152
8.6%
u41415
 
7.2%
t41415
 
7.2%
O23176
 
4.0%
Other values (3)1203
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter411026
71.6%
Uppercase Letter163296
 
28.4%

Most frequent character per category

ValueCountFrequency (%)
d113743
27.7%
l64591
15.7%
e50355
12.3%
g49553
12.1%
i49152
12.0%
u41415
 
10.1%
t41415
 
10.1%
n401
 
0.1%
a401
 
0.1%
ValueCountFrequency (%)
A90567
55.5%
M49152
30.1%
O23176
 
14.2%
T401
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin574322
100.0%

Most frequent character per script

ValueCountFrequency (%)
d113743
19.8%
A90567
15.8%
l64591
11.2%
e50355
8.8%
g49553
8.6%
M49152
8.6%
i49152
8.6%
u41415
 
7.2%
t41415
 
7.2%
O23176
 
4.0%
Other values (3)1203
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII574322
100.0%

Most frequent character per block

ValueCountFrequency (%)
d113743
19.8%
A90567
15.8%
l64591
11.2%
e50355
8.8%
g49553
8.6%
M49152
8.6%
i49152
8.6%
u41415
 
7.2%
t41415
 
7.2%
O23176
 
4.0%
Other values (3)1203
 
0.2%

Payment_Type
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 891.9 KiB

Length

Max length: 3
Median length: 2
Mean length: 2.134400407
Min length: 2

Characters and Unicode

Total characters: 243629
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AV
2nd row: AV
3rd row: AV
4th row: AV
5th row: AV
Value    Count    Frequency (%)
AV       98803    86.6%
RPU      15341    13.4%
Histogram of lengths of the category
Value    Count    Frequency (%)
av       98803    86.6%
rpu      15341    13.4%

Most occurring characters

ValueCountFrequency (%)
A98803
40.6%
V98803
40.6%
R15341
 
6.3%
P15341
 
6.3%
U15341
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter243629
100.0%

Most frequent character per category

ValueCountFrequency (%)
A98803
40.6%
V98803
40.6%
R15341
 
6.3%
P15341
 
6.3%
U15341
 
6.3%

Most occurring scripts

ValueCountFrequency (%)
Latin243629
100.0%

Most frequent character per script

ValueCountFrequency (%)
A98803
40.6%
V98803
40.6%
R15341
 
6.3%
P15341
 
6.3%
U15341
 
6.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII243629
100.0%

Most frequent character per block

ValueCountFrequency (%)
A98803
40.6%
V98803
40.6%
R15341
 
6.3%
P15341
 
6.3%
U15341
 
6.3%

logapplied
Real number (ℝ)

HIGH CORRELATION

Distinct: 1197
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.750004317
Minimum: -6.907755279
Maximum: 9.392661929
Zeros: 0
Zeros (%): 0.0%
Memory size: 891.9 KiB

Quantile statistics

Minimum: -6.907755279
5th percentile: 4.663439094
Q1: 5.347107531
Median: 5.768320996
Q3: 6.214608098
95th percentile: 6.779921907
Maximum: 9.392661929
Range: 16.30041721
Interquartile range (IQR): 0.8675005677

Descriptive statistics

Standard deviation: 0.6409119857
Coefficient of variation (CV): 0.1114628704
Kurtosis: 1.601402047
Mean: 5.750004317
Median Absolute Deviation (MAD): 0.4307829161
Skewness: -0.2494680754
Sum: 656328.4928
Variance: 0.4107681735
Monotonicity: Not monotonic
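
The report does not say how logapplied was derived, but its minimum, Q1, and maximum are consistent with the natural logarithm of the corresponding Applied values (0.001, 210, and 12000). The sketch below checks that assumption with `math.log`; the dictionary name is illustrative.

```python
import math

# Applied value -> logapplied value, both copied from this report
# (minimum, Q1, and maximum of each variable).
applied_to_logapplied = {
    0.001: -6.907755279,
    210: 5.347107531,
    12000: 9.392661929,
}

# Largest absolute gap between ln(Applied) and the reported logapplied.
max_gap = max(
    abs(math.log(applied) - reported)
    for applied, reported in applied_to_logapplied.items()
)
print(f"largest gap: {max_gap:.2e}")
```

The gaps are at the level of the report's rounding. The transform also explains the drop in skewness from 2.89 for Applied to -0.25 here.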
Histogram with fixed size bins (bins=50)
Value                  Count    Frequency (%)
5.347107531            2941     2.6%
5.991464547            2653     2.3%
5.703782475            2628     2.3%
6.214608098            2479     2.2%
5.857933154            2341     2.1%
5.521460918            2255     2.0%
5.298317367            2166     1.9%
6.396929655            2122     1.9%
5.420534999            1987     1.7%
6.109247583            1771     1.6%
Other values (1187)    90801    79.5%
Value                  Count    Frequency (%)
-6.907755279           1        < 0.1%
0.6931471806           4        < 0.1%
1.098612289            2        < 0.1%
1.386294361            1        < 0.1%
1.609437912            5        < 0.1%
Value                  Count    Frequency (%)
9.392661929            1        < 0.1%
8.696343057            1        < 0.1%
8.367532417            1        < 0.1%
8.34640487             1        < 0.1%
8.320448114            1        < 0.1%

logreceived
Real number (ℝ)

HIGH CORRELATION

Distinct: 3125
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.750026064
Minimum: -1.560647748
Maximum: 9.392661929
Zeros: 0
Zeros (%): 0.0%
Memory size: 891.9 KiB

Quantile statistics

Minimum: -1.560647748
5th percentile: 4.663439094
Q1: 5.347107531
Median: 5.768320996
Q3: 6.214608098
95th percentile: 6.779921907
Maximum: 9.392661929
Range: 10.95330968
Interquartile range (IQR): 0.8675005677

Descriptive statistics

Standard deviation: 0.6402153729
Coefficient of variation (CV): 0.1113412993
Kurtosis: 0.4355168226
Mean: 5.750026064
Median Absolute Deviation (MAD): 0.4321881782
Skewness: -0.1963128263
Sum: 656330.975
Variance: 0.4098757237
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count    Frequency (%)
5.347107531            2938     2.6%
5.991464547            2650     2.3%
5.703782475            2624     2.3%
6.214608098            2464     2.2%
5.857933154            2339     2.0%
5.521460918            2250     2.0%
5.298317367            2160     1.9%
6.396929655            2121     1.9%
5.420534999            1971     1.7%
6.109247583            1769     1.5%
Other values (3115)    90858    79.6%
Value                  Count    Frequency (%)
-1.560647748           1        < 0.1%
0.6931471806           3        < 0.1%
0.7419373447           1        < 0.1%
0.9162907319           1        < 0.1%
1.098612289            1        < 0.1%
Value                  Count    Frequency (%)
9.392661929            1        < 0.1%
8.696343057            1        < 0.1%
8.367532417            1        < 0.1%
8.34640487             1        < 0.1%
8.320555242            1        < 0.1%

Ratio
Real number (ℝ≥0)

SKEWED

Distinct: 1993
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.001806261
Minimum: 0.8333333333
Maximum: 210
Zeros: 0
Zeros (%): 0.0%
Memory size: 891.9 KiB

Quantile statistics

Minimum: 0.8333333333
5th percentile: 1
Q1: 1
Median: 1
Q3: 1
95th percentile: 1
Maximum: 210
Range: 209.1666667
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.6186145925
Coefficient of variation (CV): 0.617499228
Kurtosis: 114143.6145
Mean: 1.001806261
Median Absolute Deviation (MAD): 0
Skewness: 337.8511828
Sum: 114350.1738
Variance: 0.3826840141
Monotonicity: Not monotonic
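
The skewness of 337.85 is close to the square root of the 114144 observations, and the kurtosis of 114143.6 is close to the observation count itself. That is the signature of a single extreme value (here the maximum of 210) among values that are otherwise almost all equal to 1. The sketch below illustrates the effect on a smaller synthetic sample using the plain population-moment definition of skewness. The function name is hypothetical, and the profiler may apply a sample-size correction, so this shows the mechanism rather than reproducing the reported figure.

```python
import math


def population_skewness(values):
    """Skewness from population (biased) central moments: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Synthetic data: n - 1 identical values plus one outlier.
n = 10_000
toy = [1.0] * (n - 1) + [210.0]

skew = population_skewness(toy)
# For this configuration the closed form is (n - 2) / sqrt(n - 1),
# which approaches sqrt(n) as n grows, whatever the outlier's size.
closed_form = (n - 2) / math.sqrt(n - 1)
print(skew, closed_form, math.sqrt(n))
```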
Histogram with fixed size bins (bins=50)
Value                  Count     Frequency (%)
1                      109859    96.2%
0.9974747475           55        < 0.1%
0.997983871            49        < 0.1%
0.9978991597           34        < 0.1%
0.9993359894           34        < 0.1%
0.9973404255           34        < 0.1%
0.9969325153           33        < 0.1%
0.9966216216           31        < 0.1%
0.9979423868           31        < 0.1%
0.999070632            30        < 0.1%
Other values (1983)    3954      3.5%
Value                  Count     Frequency (%)
0.8333333333           1         < 0.1%
0.9375                 1         < 0.1%
0.94                   1         < 0.1%
0.9444444444           1         < 0.1%
0.9642857143           1         < 0.1%
Value                  Count     Frequency (%)
210                    1         < 0.1%
1.075                  1         < 0.1%
1.06                   1         < 0.1%
1.05                   1         < 0.1%
1.032                  1         < 0.1%

Year
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 891.9 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 456576
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2019
5th row: 2017
Value    Count    Frequency (%)
2019     30321    26.6%
2020     29656    26.0%
2018     26638    23.3%
2017     25478    22.3%
2016     2051     1.8%
Histogram of lengths of the category
Value    Count    Frequency (%)
2019     30321    26.6%
2020     29656    26.0%
2018     26638    23.3%
2017     25478    22.3%
2016     2051     1.8%

Most occurring characters

ValueCountFrequency (%)
2143800
31.5%
0143800
31.5%
184488
18.5%
930321
 
6.6%
826638
 
5.8%
725478
 
5.6%
62051
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number456576
100.0%

Most frequent character per category

ValueCountFrequency (%)
2143800
31.5%
0143800
31.5%
184488
18.5%
930321
 
6.6%
826638
 
5.8%
725478
 
5.6%
62051
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Common456576
100.0%

Most frequent character per script

ValueCountFrequency (%)
2143800
31.5%
0143800
31.5%
184488
18.5%
930321
 
6.6%
826638
 
5.8%
725478
 
5.6%
62051
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII456576
100.0%

Most frequent character per block

ValueCountFrequency (%)
2143800
31.5%
0143800
31.5%
184488
18.5%
930321
 
6.6%
826638
 
5.8%
725478
 
5.6%
62051
 
0.4%

Month
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.724724909
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 891.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 7
Q3: 10
95th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.373737503
Coefficient of variation (CV): 0.5016915263
Kurtosis: -1.147591316
Mean: 6.724724909
Median Absolute Deviation (MAD): 3
Skewness: -0.1156909286
Sum: 767587
Variance: 11.38210474
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value               Count    Frequency (%)
8                   10925    9.6%
7                   10875    9.5%
11                  10558    9.2%
10                  10161    8.9%
9                   10033    8.8%
6                   9964     8.7%
5                   9911     8.7%
3                   9350     8.2%
12                  8929     7.8%
2                   8710     7.6%
Other values (2)    14728    12.9%
Value               Count    Frequency (%)
1                   8284     7.3%
2                   8710     7.6%
3                   9350     8.2%
4                   6444     5.6%
5                   9911     8.7%
Value               Count    Frequency (%)
12                  8929     7.8%
11                  10558    9.2%
10                  10161    8.9%
9                   10033    8.8%
8                   10925    9.6%

Week
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.58487525
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Memory size: 891.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 4
Q1: 15
Median: 28
Q3: 40
95th percentile: 50
Maximum: 52
Range: 51
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 14.67184292
Coefficient of variation (CV): 0.5318799811
Kurtosis: -1.164333717
Mean: 27.58487525
Median Absolute Deviation (MAD): 13
Skewness: -0.1112622406
Sum: 3148648
Variance: 215.2629748
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
48                   2760     2.4%
49                   2572     2.3%
51                   2537     2.2%
31                   2530     2.2%
26                   2507     2.2%
33                   2506     2.2%
27                   2505     2.2%
25                   2478     2.2%
35                   2463     2.2%
38                   2448     2.1%
Other values (42)    88838    77.8%
Value                Count    Frequency (%)
1                    855      0.7%
2                    1838     1.6%
3                    2278     2.0%
4                    2216     1.9%
5                    2051     1.8%
Value                Count    Frequency (%)
52                   1128     1.0%
51                   2537     2.2%
50                   2325     2.0%
49                   2572     2.3%
48                   2760     2.4%

Day
Real number (ℝ≥0)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.8257289
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Memory size: 891.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 8
Median: 16
Q3: 23
95th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.664953992
Coefficient of variation (CV): 0.5475232164
Kurtosis: -1.155008974
Mean: 15.8257289
Median Absolute Deviation (MAD): 7
Skewness: 0.005606195007
Sum: 1806412
Variance: 75.08142768
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value | Count | Frequency (%)
20 | 4261 | 3.7%
19 | 4181 | 3.7%
13 | 4079 | 3.6%
21 | 3968 | 3.5%
17 | 3955 | 3.5%
11 | 3940 | 3.5%
27 | 3916 | 3.4%
12 | 3912 | 3.4%
24 | 3908 | 3.4%
5 | 3886 | 3.4%
Other values (21) | 74138 | 65.0%

Value | Count | Frequency (%)
1 | 3278 | 2.9%
2 | 3438 | 3.0%
3 | 3480 | 3.0%
4 | 3594 | 3.1%
5 | 3886 | 3.4%

Value | Count | Frequency (%)
31 | 2314 | 2.0%
30 | 3353 | 2.9%
29 | 3449 | 3.0%
28 | 3557 | 3.1%
27 | 3916 | 3.4%

Dayofweek
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.089457177
Minimum: 0
Maximum: 6
Zeros: 21644
Zeros (%): 19.0%
Memory size: 891.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 2
Q3: 3
95th percentile: 4
Maximum: 6
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.43620049
Coefficient of variation (CV): 0.6873557908
Kurtosis: -1.279331378
Mean: 2.089457177
Median Absolute Deviation (MAD): 1
Skewness: -0.0553214757
Sum: 238499
Variance: 2.062671848
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value | Count | Frequency (%)
4 | 24789 | 21.7%
3 | 23421 | 20.5%
2 | 22230 | 19.5%
0 | 21644 | 19.0%
1 | 21421 | 18.8%
5 | 635 | 0.6%
6 | 4 | < 0.1%

Value | Count | Frequency (%)
0 | 21644 | 19.0%
1 | 21421 | 18.8%
2 | 22230 | 19.5%
3 | 23421 | 20.5%
4 | 24789 | 21.7%

Value | Count | Frequency (%)
6 | 4 | < 0.1%
5 | 635 | 0.6%
4 | 24789 | 21.7%
3 | 23421 | 20.5%
2 | 22230 | 19.5%

Dayofyear
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 359
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 189.3833841
Minimum: 3
Maximum: 365
Zeros: 0
Zeros (%): 0.0%
Memory size: 891.9 KiB

Quantile statistics

Minimum: 3
5th percentile: 23
Q1: 100
Median: 194
Q3: 276
95th percentile: 345
Maximum: 365
Range: 362
Interquartile range (IQR): 176

Descriptive statistics

Standard deviation: 102.7471838
Coefficient of variation (CV): 0.5425353669
Kurtosis: -1.158843022
Mean: 189.3833841
Median Absolute Deviation (MAD): 87
Skewness: -0.1055779516
Sum: 21616977
Variance: 10556.98378
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
213 | 557 | 0.5%
227 | 524 | 0.5%
354 | 520 | 0.5%
311 | 514 | 0.5%
226 | 513 | 0.4%
332 | 507 | 0.4%
318 | 501 | 0.4%
331 | 500 | 0.4%
304 | 499 | 0.4%
276 | 498 | 0.4%
Other values (349) | 109011 | 95.5%

Value | Count | Frequency (%)
3 | 192 | 0.2%
4 | 215 | 0.2%
5 | 145 | 0.1%
6 | 172 | 0.2%
7 | 191 | 0.2%

Value | Count | Frequency (%)
365 | 188 | 0.2%
364 | 147 | 0.1%
363 | 141 | 0.1%
362 | 155 | 0.1%
361 | 292 | 0.3%

Is_month_end
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 111.6 KiB

Value | Count | Frequency (%)
False | 110280 | 96.6%
True | 3864 | 3.4%

Is_month_start
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 111.6 KiB

Value | Count | Frequency (%)
False | 110866 | 97.1%
True | 3278 | 2.9%

Is_quarter_end
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 111.6 KiB

Value | Count | Frequency (%)
False | 113334 | 99.3%
True | 810 | 0.7%

Is_quarter_start
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 111.6 KiB

Value | Count | Frequency (%)
False | 113377 | 99.3%
True | 767 | 0.7%

Is_year_end
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 111.6 KiB

Value | Count | Frequency (%)
False | 114015 | 99.9%
True | 129 | 0.1%

Is_year_start
Boolean

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 111.6 KiB

Value | Count | Frequency (%)
False | 114144 | 100.0%

Elapsed
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1123
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1547631170
Minimum: 1480550400
Maximum: 1606694400
Zeros: 0
Zeros (%): 0.0%
Memory size: 891.9 KiB

Quantile statistics

Minimum: 1480550400
5th percentile: 1488153600
Q1: 1516233600
Median: 1550016000
Q3: 1579132800
95th percentile: 1602028800
Maximum: 1606694400
Range: 126144000
Interquartile range (IQR): 62899200

Descriptive statistics

Standard deviation: 36734422.09
Coefficient of variation (CV): 0.02373590219
Kurtosis: -1.19025494
Mean: 1547631170
Median Absolute Deviation (MAD): 31276800
Skewness: -0.1176796452
Sum: 1.766528123 × 10^14
Variance: 1.349417766 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1577059200 | 204 | 0.2%
1576800000 | 184 | 0.2%
1596153600 | 183 | 0.2%
1595203200 | 181 | 0.2%
1606694400 | 180 | 0.2%
1603756800 | 180 | 0.2%
1600646400 | 179 | 0.2%
1564358400 | 178 | 0.2%
1572825600 | 174 | 0.2%
1576713600 | 172 | 0.2%
Other values (1113) | 112329 | 98.4%

Value | Count | Frequency (%)
1480550400 | 117 | 0.1%
1480636800 | 93 | 0.1%
1480896000 | 81 | 0.1%
1480982400 | 88 | 0.1%
1481068800 | 148 | 0.1%

Value | Count | Frequency (%)
1606694400 | 180 | 0.2%
1606521600 | 12 | < 0.1%
1606435200 | 172 | 0.2%
1606348800 | 162 | 0.1%
1606262400 | 149 | 0.1%

Interactions

[Figures: pairwise interaction plots (Matplotlib SVG images)]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
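The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report itself runs; the function name `pearson_r` and the toy inputs are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    Hypothetical helper for illustration. The 1/n (or 1/(n-1)) factors cancel
    between numerator and denominator, so plain sums are used.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined for a constant variable")
    return cov / (sd_x * sd_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.5, 5.5, 8.0, 11.0]
r_original = pearson_r(x, y)
# Shifting and rescaling y (positive scale) leaves r unchanged.
r_rescaled = pearson_r(x, [3.0 * v + 10.0 for v in y])
```

A constant column such as Is_year_start has zero standard deviation, so r is undefined for it; the sketch raises an error in that case.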

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
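As a rough sketch of that procedure, the snippet below ranks each variable (assigning tied values their average rank, a common convention assumed here) and then applies the Pearson formula to the ranks. The helper names `average_ranks` and `spearman_rho` are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined for a constant variable")
    return cov / (sd_x * sd_y)


# A nonlinear but strictly increasing relationship still gives rho = 1.
rho = spearman_rho([1, 2, 3, 4], [1, 8, 27, 64])
```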

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
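The pair-counting definition above corresponds to the variant often called tau-a. The sketch below implements exactly that definition with a simple O(n²) loop; the function name `kendall_tau_a` is an assumption for the example, and statistical libraries typically default to a tie-adjusted variant, so results can differ when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 means a tie in x or y; the pair counts toward neither
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Two concordant pairs and one discordant pair out of three.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```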

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
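For orientation, here is a minimal sketch of a bias-corrected Cramér's V computed from a contingency table of counts. It follows the commonly cited form of Bergsma's correction (shrinking φ² and the table dimensions by terms in 1/(n-1)); whether the report uses precisely these steps is an assumption. The function name `cramers_v_corrected` is hypothetical, and no continuity correction is applied to the chi-squared statistic.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k contingency table (list of rows)."""
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table (for example, a constant variable)
    return math.sqrt(phi2_corr / denom)


independent = cramers_v_corrected([[10, 10], [10, 10]])
perfect = cramers_v_corrected([[50, 0], [0, 50]])
```

In the two toy tables, the first has identical rows (no association) and the second puts every observation on the diagonal (perfect association).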

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

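Index | Applied | Gender | Payment_Method | Location | Received | Id | Reason | Age | Area | True_False | AgeGroup | Payment_Type | logapplied | logreceived | Ratio | Year | Month | Week | Day | Dayofweek | Dayofyear | Is_month_end | Is_month_start | Is_quarter_end | Is_quarter_start | Is_year_end | Is_year_start | Elapsed
0 | 95.0 | M | AV | M | 95.0 | GHI001040140 | RA | 25-29 | T | 0 | Adult | AV | 4.553877 | 4.553877 | 1.0 | 2020 | 10 | 42 | 16 | 4 | 290 | False | False | False | False | False | False | 1602806400
1 | 90.0 | M | AV | U | 90.0 | GHI000096195 | RA | 20-24 | O | 0 | Adult | AV | 4.499810 | 4.499810 | 1.0 | 2020 | 5 | 20 | 13 | 2 | 134 | False | False | False | False | False | False | 1589328000
2 | 85.0 | F | AV | M | 85.0 | GHI000873216 | RA | 20-24 | O | 0 | Adult | AV | 4.442651 | 4.442651 | 1.0 | 2020 | 9 | 40 | 29 | 1 | 273 | False | False | False | False | False | False | 1601337600
3 | 425.0 | F | AV | M | 425.0 | GHI000165164 | RA | 40-44 | W | 0 | MidAge | AV | 6.052089 | 6.052089 | 1.0 | 2019 | 7 | 28 | 10 | 2 | 191 | False | False | False | False | False | False | 1562716800
4 | 450.0 | F | AV | M | 450.0 | GHI000085542 | RA | 25-29 | C | 0 | Adult | AV | 6.109248 | 6.109248 | 1.0 | 2017 | 9 | 38 | 18 | 0 | 261 | False | False | False | False | False | False | 1505692800
5 | 370.0 | M | AV | M | 370.0 | GHI000100619 | RA | 25-29 | C | 0 | Adult | AV | 5.913503 | 5.913503 | 1.0 | 2017 | 8 | 32 | 11 | 4 | 223 | False | False | False | False | False | False | 1502409600
6 | 590.0 | F | AV | M | 590.0 | GHI000077859 | RA | 25-29 | O | 0 | Adult | AV | 6.380123 | 6.380123 | 1.0 | 2020 | 9 | 38 | 16 | 2 | 260 | False | False | False | False | False | False | 1600214400
7 | 210.0 | F | AV | M | 210.0 | GHI000112669 | RA | 50-54 | O | 0 | Old | AV | 5.347108 | 5.347108 | 1.0 | 2018 | 1 | 4 | 24 | 2 | 24 | False | False | False | False | False | False | 1516752000
8 | 680.0 | F | AV | M | 680.0 | GHI000307469 | RA | 30-34 | O | 0 | MidAge | AV | 6.522093 | 6.522093 | 1.0 | 2018 | 12 | 1 | 31 | 0 | 365 | True | False | True | False | True | False | 1546214400
9 | 220.0 | M | AV | M | 220.0 | GHI000612397 | RA | 65+ | AM | 0 | Old | AV | 5.393628 | 5.393628 | 1.0 | 2019 | 6 | 23 | 7 | 4 | 158 | False | False | False | False | False | False | 1559865600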

Last rows

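Index | Applied | Gender | Payment_Method | Location | Received | Id | Reason | Age | Area | True_False | AgeGroup | Payment_Type | logapplied | logreceived | Ratio | Year | Month | Week | Day | Dayofweek | Dayofyear | Is_month_end | Is_month_start | Is_quarter_end | Is_quarter_start | Is_year_end | Is_year_start | Elapsed
114134 | 230.0 | M | AV | O | 230.00 | GHI000077856 | RA | 50-54 | AM | 0 | Old | AV | 5.438079 | 5.438079 | 1.000000 | 2018 | 12 | 52 | 24 | 0 | 358 | False | False | False | False | False | False | 1545609600
114135 | 290.0 | F | AV | NE | 290.00 | GHI001459210 | RA | 65+ | S | 0 | Old | AV | 5.669881 | 5.669881 | 1.000000 | 2019 | 7 | 28 | 11 | 3 | 192 | False | False | False | False | False | False | 1562803200
114136 | 260.0 | F | RP | M | 260.00 | GHI001324033 | RA | 35-39 | S | 0 | MidAge | RPU | 5.560682 | 5.560682 | 1.000000 | 2017 | 4 | 17 | 28 | 4 | 118 | False | False | False | False | False | False | 1493337600
114137 | 580.0 | M | AV | M | 580.00 | GHI000088371 | RA | 25-29 | AM | 0 | Adult | AV | 6.363028 | 6.363028 | 1.000000 | 2018 | 12 | 50 | 14 | 4 | 348 | False | False | False | False | False | False | 1544745600
114138 | 106.0 | M | AV | PP | 106.00 | GHI000112669 | RA | 60-64 | O | 0 | Old | AV | 4.663439 | 4.663439 | 1.000000 | 2018 | 11 | 48 | 27 | 1 | 331 | False | False | False | False | False | False | 1543276800
114139 | 210.0 | F | AV | M | 210.00 | GHI001304576 | RA | 25-29 | BP | 0 | Adult | AV | 5.347108 | 5.347108 | 1.000000 | 2017 | 9 | 36 | 8 | 4 | 251 | False | False | False | False | False | False | 1504828800
114140 | 480.0 | M | AV | M | 480.00 | GHI000222992 | RA | 20-24 | EC | 0 | Adult | AV | 6.173786 | 6.173786 | 1.000000 | 2020 | 10 | 44 | 27 | 1 | 301 | False | False | False | False | False | False | 1603756800
114141 | 344.0 | F | AV | O | 344.12 | GHI000722944 | RA | 35-39 | AM | 0 | MidAge | AV | 5.840642 | 5.840990 | 1.000349 | 2017 | 2 | 9 | 28 | 1 | 59 | True | False | False | False | False | False | 1488240000
114142 | 220.0 | F | AV | NE | 220.00 | GHI001224192 | RA | 50-54 | W | 0 | Old | AV | 5.393628 | 5.393628 | 1.000000 | 2017 | 8 | 35 | 28 | 0 | 240 | False | False | False | False | False | False | 1503878400
114143 | 226.0 | F | AV | PP | 226.00 | GHI000076617 | RA | 25-29 | O | 0 | Adult | AV | 5.420535 | 5.420535 | 1.000000 | 2019 | 10 | 44 | 30 | 2 | 303 | False | False | False | False | False | False | 1572393600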

Duplicate rows

Most frequent

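Index | Applied | Gender | Payment_Method | Location | Received | Id | Reason | Age | Area | True_False | AgeGroup | Payment_Type | logapplied | logreceived | Ratio | Year | Month | Week | Day | Dayofweek | Dayofyear | Is_month_end | Is_month_start | Is_quarter_end | Is_quarter_start | Is_year_end | Is_year_start | Elapsed | count
911 | 226.0 | F | AV | M | 226.0 | GHI000112669 | RA | 25-29 | O | 0 | Adult | AV | 5.420535 | 5.420535 | 1.0 | 2019 | 8 | 34 | 19 | 0 | 231 | False | False | False | False | False | False | 1566172800 | 8
895 | 226.0 | F | AV | M | 226.0 | GHI000112669 | RA | 20-24 | O | 0 | Adult | AV | 5.420535 | 5.420535 | 1.0 | 2020 | 3 | 12 | 16 | 0 | 76 | False | False | False | False | False | False | 1584316800 | 6
934 | 226.0 | F | AV | M | 226.0 | GHI000112669 | RA | 25-29 | O | 0 | Adult | AV | 5.420535 | 5.420535 | 1.0 | 2019 | 12 | 52 | 23 | 0 | 357 | False | False | False | False | False | False | 1577059200 | 6
961 | 226.0 | F | AV | M | 226.0 | GHI000112669 | RA | 30-34 | O | 0 | MidAge | AV | 5.420535 | 5.420535 | 1.0 | 2019 | 8 | 31 | 1 | 3 | 213 | False | True | False | False | False | False | 1564617600 | 6
281 | 208.0 | F | AV | M | 208.0 | GHI000112669 | RA | 20-24 | O | 0 | Adult | AV | 5.337538 | 5.337538 | 1.0 | 2016 | 12 | 49 | 8 | 3 | 343 | False | False | False | False | False | False | 1481155200 | 5
283 | 208.0 | F | AV | M | 208.0 | GHI000112669 | RA | 20-24 | O | 0 | Adult | AV | 5.337538 | 5.337538 | 1.0 | 2016 | 12 | 51 | 22 | 3 | 357 | False | False | False | False | False | False | 1482364800 | 5
299 | 208.0 | F | AV | M | 208.0 | GHI000112669 | RA | 20-24 | O | 0 | Adult | AV | 5.337538 | 5.337538 | 1.0 | 2017 | 3 | 12 | 20 | 0 | 79 | False | False | False | False | False | False | 1489968000 | 5
304 | 208.0 | F | AV | M | 208.0 | GHI000112669 | RA | 25-29 | O | 0 | Adult | AV | 5.337538 | 5.337538 | 1.0 | 2016 | 12 | 49 | 7 | 2 | 342 | False | False | False | False | False | False | 1481068800 | 5
458 | 210.0 | F | AV | M | 210.0 | GHI000112669 | RA | 20-24 | O | 0 | Adult | AV | 5.347108 | 5.347108 | 1.0 | 2017 | 11 | 46 | 14 | 1 | 318 | False | False | False | False | False | False | 1510617600 | 5
507 | 210.0 | F | AV | M | 210.0 | GHI000112669 | RA | 25-29 | O | 0 | Adult | AV | 5.347108 | 5.347108 | 1.0 | 2017 | 10 | 44 | 31 | 1 | 304 | True | False | False | False | False | False | 1509408000 | 5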